CIs and p-values

Much of the inference part of conventional intro stats is about computing confidence intervals and p-values in a variety of settings:

  • difference of means
  • difference of proportions
  • slope of regression line

Formulas from Triola

No statistics were harmed in the filming of this documentary

In the following, spread is measured using the length of the 85% summary interval.

You can still do all this by using the standard deviation instead.

Star notation

The usual notation for statistical significance is something like p < 0.04.

In regression reports, statistical significance is abbreviated with stars:

  • ★ p < .05
  • ★★ p < .01
  • ★★★ p < .001
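The star convention above can be sketched as a small lookup. This is an illustration only: the function name `stars_from_p` is made up here, and treating the cutoffs as strict inequalities is an assumption taken from the "p <" notation.

```python
def stars_from_p(p: float) -> str:
    """Return the conventional star string for a p-value.

    Hypothetical helper; cutoffs follow the list above and are strict.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.001:
        return "★★★"
    if p < 0.01:
        return "★★"
    if p < 0.05:
        return "★"
    return ""  # not significant at the 0.05 level: no stars
```

Note that a p-value of exactly 0.05 earns no star under this reading.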

At the same time, we want to be able:

  1. To satisfy the traditionalists.
  2. To put “significance” on a scale that doesn’t lead to familiar but fallacious probability interpretations, e.g. “The probability that the Null is true is 0.01.”
  3. To make it clear that 0.05 is a weak standard for significance.
  4. To remind people that things other than “statistical significance” are important.

My proposal: Amazon-like ratings

  • No stars: anecdotal, no sampling plan
  • One star: sampling plan and p < 0.05
  • Two stars: one star + p < 0.01
  • Three stars: two stars + p < 0.001
  • Four stars: three stars and covariates considered
  • Five stars: four stars and effect size reaches a magnitude of practical importance.
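One way to read the proposed rating is as a ladder, where each level requires everything below it. The sketch below encodes that reading. The function and argument names are hypothetical, and giving zero stars to a study that has a sampling plan but p ≥ 0.05 is an assumption, since the list does not cover that case.

```python
def star_rating(sampling_plan: bool,
                p: float,
                covariates_considered: bool,
                practical_importance: bool) -> int:
    """Return 0-5 stars following the ladder in the proposal.

    Assumption: a study with a sampling plan but p >= 0.05 gets no stars.
    """
    if not sampling_plan or p >= 0.05:
        return 0
    if p >= 0.01:
        return 1
    if p >= 0.001:
        return 2
    # From here on p < 0.001, so the remaining stars depend on
    # things other than the p-value.
    if not covariates_considered:
        return 3
    if not practical_importance:
        return 4
    return 5
```

Under this reading, a tiny p-value alone never earns more than three stars, which is the point of items 3 and 4 in the list of goals.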

The procedure

  1. Plot out data, draw in model.
  2. From data and model, find R and sample size n. The picture will tell you R.
  3. Measure effect size, e.g. difference in means or slope of regression line. Call it \(\Delta\).
  4. Calculate \(F = (n-2) \frac{R^2}{1-R^2}\).
  5. The 95% confidence interval on \(\Delta\) is approximately \(\Delta \left(1 \pm \sqrt{4/F}\right)\), since the standard error of \(\Delta\) is \(\Delta/\sqrt{F}\).
  6. Old-timers call it “significance”, but let’s call it … ?
    • \(F = 4\) – one star (corresponds to about 0.05)
    • \(F = 7\) – two stars (corresponds to about 0.01)
    • \(F = 12\) – three stars (corresponds to about 0.001)
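The arithmetic in steps 4 to 6 can be sketched in a few lines. All function names are made up for illustration. The interval is computed on the assumption that the standard error of Δ is Δ/√F, so the 95% margin is about 2Δ/√F, and the star cutoffs are read as "F at least 4, 7, 12".

```python
import math


def f_statistic(r: float, n: int) -> float:
    """Step 4: F = (n - 2) * R^2 / (1 - R^2)."""
    if n <= 2:
        raise ValueError("need n > 2")
    if not abs(r) < 1:
        raise ValueError("need |R| < 1")
    return (n - 2) * r ** 2 / (1 - r ** 2)


def confidence_interval(delta: float, f: float) -> tuple[float, float]:
    """Step 5 (assumed reading): delta plus or minus delta * sqrt(4 / F)."""
    if f <= 0:
        raise ValueError("need F > 0")
    margin = abs(delta) * math.sqrt(4 / f)
    return delta - margin, delta + margin


def stars_from_f(f: float) -> int:
    """Step 6: approximate cutoffs at F = 4, 7, 12."""
    if f >= 12:
        return 3
    if f >= 7:
        return 2
    if f >= 4:
        return 1
    return 0
```

With F = 4 the margin equals Δ itself, so the interval just touches zero. That is why one star and "the 95% interval excludes zero" amount to the same statement.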

The F table

Maybe blue, red, white, yellow

Calculating F

Difference of means

The ratio of the intervals is \(R \approx 0.25\) (and \(n = 39\)).

  • Difference between the means: read it off the graph: ∆.

  • CI on difference between means: \(\Delta \left(1 \pm \frac{2}{R}\sqrt{\frac{1-R^2}{n-2}}\right)\)

  • “Significance” is \((n-2) \frac{R^2}{1-R^2}\)
    • One star when 4, two stars when 7, three stars when 12.
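Plugging in the numbers read off this graph, \(R \approx 0.25\) and \(n = 39\), gives a quick worked check (rounded):

\[
(n-2)\,\frac{R^2}{1-R^2} = 37 \times \frac{0.0625}{0.9375} \approx 2.5
\]

That is below 4, so this difference of means earns no stars.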

Regression slope

The ratio of the vertical intervals is \(R \approx 0.65\) (and \(n = 39\)).

  • Slope is ratio of rise interval over run interval. Call it ∆.

  • CI on slope: \(\Delta \left(1 \pm \frac{2}{R}\sqrt{\frac{1-R^2}{n-2}}\right)\)

  • “Significance” is \(F = (n-2)\frac{R^2}{1-R^2}\)
    • One star when 4, two stars when 7, three stars when 12.
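With the values read off this graph, \(R \approx 0.65\) and \(n = 39\), the arithmetic works out as follows (rounded):

\[
F = (n-2)\,\frac{R^2}{1-R^2} = 37 \times \frac{0.4225}{0.5775} \approx 27
\]

That is well above 12, so the slope earns three stars.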

Difference in proportions

Fill in an example, say, domhand versus sex

Slope of proportion

Fill in an example, say sex versus width.

Multiple regression